s03 - Complementing a Strand of DNA
Problem
In DNA strings, symbols ‘A’ and ‘T’ are complements of each other, as are ‘C’ and ‘G’.
The reverse complement of a DNA string s is the string s^c formed by reversing the symbols of s, then taking the complement of each symbol (e.g., the reverse complement of “GTCA” is “TGAC”).
Given: A DNA string s of length at most 1000 bp.
Return: The reverse complement s^c of s.
Sample Dataset
AAAACCCGGT
Sample Output
ACCGGGTTTT
Solution
This problem asks for the reverse complement of a DNA sequence.
C version
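For the C version I used a linked list. Each base is complemented as it is read and pushed onto the front of the list, so the list already holds the reverse complement; all that remains is to traverse the list and print it.
#include <stdio.h>
#include <stdlib.h>

typedef struct ntNode {
    char NT;                 /* nucleotide */
    struct ntNode *next;
} ntNode;

int main(void) {
    FILE *INFILE = fopen("DATA/rosalind_revc.txt", "r");
    if (INFILE == NULL) {
        perror("fopen");
        return 1;
    }

    ntNode *head = NULL, *curr;
    int nt;                  /* int rather than char, so EOF is detected reliably */

    while ((nt = fgetc(INFILE)) != EOF) {
        /* complement the base; skip anything else (e.g. the trailing newline) */
        switch (nt) {
        case 'A':
            nt = 'T';
            break;
        case 'C':
            nt = 'G';
            break;
        case 'G':
            nt = 'C';
            break;
        case 'T':
            nt = 'A';
            break;
        default:
            continue;
        }
        curr = malloc(sizeof(ntNode));
        if (curr == NULL) {
            perror("malloc");
            return 1;
        }
        curr->NT = (char)nt;
        curr->next = head;   /* push to the front: the list ends up reversed */
        head = curr;
    }
    fclose(INFILE);

    /* traverse the list, printing and freeing each node */
    while (head) {
        curr = head;
        printf("%c", curr->NT);
        head = head->next;
        free(curr);
    }
    printf("\n");
    return 0;
}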
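
The prepend-while-reading idea behind the C version can also be sketched in Python. This is only an illustration, not part of the original solution: `collections.deque` stands in for the hand-written linked list, and the names `COMPLEMENT` and `revc_by_prepending` are made up for this example.

```python
from collections import deque

# Hypothetical lookup table for this sketch: base -> complementary base.
COMPLEMENT = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}

def revc_by_prepending(seq):
    """Complement each base as it is read and push it onto the front."""
    nodes = deque()
    for base in seq:
        if base in COMPLEMENT:              # skip newlines and other characters
            nodes.appendleft(COMPLEMENT[base])
    # The first base read is now last, so the order is already reversed.
    return ''.join(nodes)

print(revc_by_prepending("AAAACCCGGT"))     # sample dataset from the problem
```

Because every new element goes to the front, no separate reversal step is needed, which is the same reason the C code can simply walk the list from `head`.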
Python version
Python makes this much easier: seq[::-1] reverses the sequence, and a dictionary maps each base to its complement.
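#!/usr/bin/env python3

# complement of each base (named so it does not shadow the built-in dict)
complement = {'A': 'T',
              'T': 'A',
              'C': 'G',
              'G': 'C'}

with open("DATA/rosalind_revc.txt", "r") as fh:
    seq = fh.read().strip()

# reverse the sequence, then complement each base
res = ''.join(complement[c] for c in seq[::-1])
print(res)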
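
As a possible alternative to the dictionary lookup, Python's built-in `str.maketrans` and `str.translate` can do the complement step in a single call. The following is a minimal sketch, not the method used above; `COMPLEMENT_TABLE` and `reverse_complement` are names chosen for this example, and it assumes the input contains only uppercase A, C, G and T (any other character would pass through unchanged).

```python
# Translation table mapping each base to its complement (A<->T, C<->G).
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse the sequence, then complement every base via the table."""
    return seq[::-1].translate(COMPLEMENT_TABLE)

print(reverse_complement("AAAACCCGGT"))  # sample dataset from the problem
```

Unlike the dictionary version, this variant does not raise a `KeyError` on unexpected characters, so whether it fits depends on how strictly the input should be validated.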